Abstract. We consider sequential selection of an alternating subsequence
from a sequence of independent, identically distributed, continuous random
variables, and we determine the exact asymptotic behavior of an optimal
sequentially selected subsequence. Moreover, we find (in a sense we make precise)
that a person who is constrained to make sequential selections does only about
12% worse than a person who can make selections with full knowledge of the
random sequence.

Key Words: Bellman equation, on-line selection, sequential selection, 
prophet inequality, alternating subsequence 

Mathematics Subject Classification (2000): Primary: 60C05, 90C40; 
Secondary: 90C27, 90C39 



1. Introduction 

Given a finite (or infinite) sequence $x = \{x_1, x_2, \ldots\}$ of real numbers, we
say that a subsequence $x_{i_1}, x_{i_2}, \ldots, x_{i_k}, \ldots$ with
$1 \le i_1 < i_2 < \cdots < i_k < \cdots$ is alternating if we have
$x_{i_1} < x_{i_2} > x_{i_3} < x_{i_4} \cdots$. When $x$ is an element of the set
of permutations $S_n$ of the integers $\{1, 2, \ldots, n\}$, the study of the set of
alternating permutations goes back to Euler (cf. Stanley, 2010).

Here we are mainly concerned with the length $a(x)$ of the longest alternating
subsequence of $x$. This function has been more recently studied by Widom (2006),
Pemantle (cf. Stanley, 2007, p. 568) and Stanley (2008). In particular, they
consider the situation in which $x$ is chosen at random from $S_n$. By exploiting
explicit formulas for generating functions and delicate applications of the saddle
point method, they were able to obtain exact formulas for the first two moments
and to prove a central limit theorem. Specifically, if $x$ is chosen according to the
uniform distribution on the set of permutations $S_n$ and if $A_n := a(x)$ denotes the
length of the longest alternating subsequence of $x$, then for $n \ge 4$ one has

\[ E[A_n] = \frac{2n}{3} + \frac{1}{6} \quad \text{and} \quad \mathrm{Var}[A_n] = \frac{8n}{45} - \frac{13}{180}. \]

More recently, Houdre and Restrepo (2010) used purely probabilistic means to 
obtain a simpler proof of this result and the corresponding central limit theorem. 
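
The two moment formulas can be checked by brute force for small $n$. The following Python sketch is an illustration added for the reader and is not part of the cited proofs; the function names `longest_alternating` and `exact_moments` are hypothetical. It enumerates all permutations, computes the longest subsequence of the form $x_{i_1} < x_{i_2} > x_{i_3} < \cdots$ with a simple quadratic dynamic program, and compares the exact rational moments with $2n/3 + 1/6$ and $8n/45 - 13/180$.

```python
from fractions import Fraction
from itertools import permutations


def longest_alternating(x):
    """Length of the longest subsequence with x[i1] < x[i2] > x[i3] < ..."""
    n = len(x)
    if n == 0:
        return 0
    # odd_len[i]: best odd-length alternating subsequence ending at x[i]
    # (a single term, or one whose last step is a fall).
    # even_len[i]: best even-length one ending at x[i] (last step is a rise);
    # the value 0 means that no such subsequence exists.
    odd_len = [1] * n
    even_len = [0] * n
    for i in range(n):
        for j in range(i):
            if x[j] < x[i]:
                even_len[i] = max(even_len[i], odd_len[j] + 1)
            elif x[j] > x[i] and even_len[j] > 0:
                odd_len[i] = max(odd_len[i], even_len[j] + 1)
    return max(max(odd_len), max(even_len))


def exact_moments(n):
    """Exact mean and variance of the length over all n! permutations."""
    lengths = [longest_alternating(p) for p in permutations(range(1, n + 1))]
    count = len(lengths)
    mean = Fraction(sum(lengths), count)
    second = Fraction(sum(k * k for k in lengths), count)
    return mean, second - mean * mean


if __name__ == "__main__":
    for n in range(4, 8):
        mean, var = exact_moments(n)
        print(n, mean, Fraction(2 * n, 3) + Fraction(1, 6),
              var, Fraction(8 * n, 45) - Fraction(13, 180))
```

Exact rational arithmetic is used so that agreement with the formulas is an identity rather than a floating-point coincidence; the enumeration is only practical for small $n$.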



A. Arlotto: Wharton School, Department of Operations and Information Management, Huntsman 
Hall 527.2, University of Pennsylvania, Philadelphia, PA 19104. 

R.W. Chen: Department of Mathematics, University of Miami, Coral Gables, FL 33124. 
L.A. Shepp: Wharton School, Department of Statistics, Huntsman Hall 462, University of Penn- 
sylvania, Philadelphia, PA 19104. 

J.M. Steele: Wharton School, Department of Statistics, Huntsman Hall 447, University of Penn- 
sylvania, Philadelphia, PA 19104. 




ARLOTTO, A., CHEN, R.W., SHEPP, L.A., STEELE, J. M. 



Moreover, the methods of Houdre and Restrepo also apply to models of random
words that are more refined than simple random selection from the set of permutations.

Here, we study the problem of making on-line selection of an alternating
subsequence. That is, we now regard the sequence $x_1, x_2, \ldots$ as being presented to us
sequentially, and, at the time $i$ when $x_i$ is presented, we must choose to include $x_i$ as
a term of our subsequence, or we must reject $x_i$ as a member of the subsequence.

We will consider the sequence to be given by independent random variables
$X_1, X_2, \ldots$ that have a common continuous distribution $F$, and, since we are only
concerned with order properties, one can without loss of generality take the
distribution to be uniform on $[0, 1]$. We now need to be more explicit about the set $\Pi$ of
feasible strategies for on-line selection. At time $i$, when presented with $X_i$, we must
decide whether to select $X_i$ based on its value, the values of earlier members of the sequence,
and the actions we have taken in the past. All of this information can be captured
by saying that $\tau_k$, the index of the $k$'th selection, must be a stopping time with
respect to the increasing sequence of $\sigma$-fields $\mathcal{F}_i = \sigma\{X_1, X_2, \ldots, X_i\}$, $i = 1, 2, \ldots$.
Given any feasible policy $\pi \in \Pi$, the random variable of most interest here is $A_n^o(\pi)$,
the number of selections made by the policy $\pi$ up to and including time $n$. In
other words, $A_n^o(\pi)$ is equal to the largest $k$ for which there are stopping times
$1 \le \tau_1 < \tau_2 < \cdots < \tau_k \le n$ such that $\{X_{\tau_1}, X_{\tau_2}, \ldots, X_{\tau_k}\}$ is an alternating
sequence.

Theorem 1 (Asymptotic Selection Rate for Large Samples). For each $n = 1, 2, \ldots$,
there is a policy $\pi_n^* \in \Pi$ such that

\[ E[A_n^o(\pi_n^*)] = \sup_{\pi \in \Pi} E[A_n^o(\pi)], \]

and for such an optimal policy and all $n \ge 1$ one has the bounds

\[ (2 - \sqrt{2})\, n \le E[A_n^o(\pi_n^*)] \le (2 - \sqrt{2})\, n + C, \]

where $C$ is a constant with $C < 11 - 4\sqrt{2} \approx 5.343$.

The proof of this result exploits the analysis of a closely related selection problem
in which one considers a sample of size $N$ where $N$ is geometrically distributed with
parameter $0 < \rho < 1$ (so one has $P(N = k) = \rho^{k-1}(1 - \rho)$, $k = 1, 2, 3, \ldots$). Here we
also assume that $N$ is independent of the sequence $X_1, X_2, \ldots$.

Parallel to our first theorem, we consider the number $A_N^o(\pi)$ of selections made
by a feasible policy $\pi$ up to and including the random time $N$. The geometric
smoothing provided by $N$ gives us a useful "shift symmetry" that is missing in
the fixed $n$ problem, and the analysis of a geometric sample turns out to be far
more tractable. In particular, one can determine the exact expected length of the
sequence selected by an optimal policy.

Theorem 2 (Expected Selection Size in Geometric Samples). For each $0 < \rho < 1$,
there is a $\pi^* \in \Pi$ such that

\[ E[A_N^o(\pi^*)] = \sup_{\pi \in \Pi} E[A_N^o(\pi)], \]

and for such an optimal policy one has

\[ E[A_N^o(\pi^*)] = \frac{3 - 2\sqrt{2} - \rho + \rho\sqrt{2}}{\rho(1 - \rho)} \sim (2 - \sqrt{2})(1 - \rho)^{-1} \quad \text{as } \rho \to 1. \]
These theorems respectively tell us that optimal on-line selection yields
subsequences that grow at a linear rate $(2 - \sqrt{2})\,n \approx 0.586\,n$ or $(2 - \sqrt{2})\,E N \approx 0.586\,E N$.
This is about a 12% discount off the rate $(2/3)\,n \approx 0.667\,n$ that one would obtain
with a priori knowledge of the full finite sample $\{X_1, X_2, \ldots, X_n\}$, and this discount
seems quite modest given the great difference in the knowledge that one has.

To build some intuition about these rates, one should also consider the "maxi- 
mally timid strategy" where one chooses the first observation that falls in [0, 0.5], 
then one chooses the next observation that falls in [0.5, 1], and then the next that 
falls in [0, 0.5], and so on. This strategy obviously leads to an asymptotic selec- 
tion rate of 0.5 n. Finally, one should also consider the "purely greedy strategy" 
where one accepts any new arrival that is feasible given the previous selections. 
Curiously enough, by a reasonably quick Markov chain calculation one can show 
that the greedy strategy leads to the same selection rate 0.5 n that one finds for the 
"maximally timid strategy." 

We begin by proving Theorem 2, which will give us an exact formula for the
expected number of selections made under the optimal policy for geometric samples.
This result will then be used to prove the upper and lower bounds of Theorem 1.


2. Proof of Theorem 2

We now let $S_i$ denote the value of the last member of the subsequence selected
up to and including time $i$. To keep track of the up-down nature of our selections,
we then set $R_i = 0$ if $S_i$ is a local minimum of $\{S_0, S_1, \ldots, S_i\}$ and set $R_i = 1$ if $S_i$
is a local maximum. To initialize our process, we set $S_0 = 1$ and $R_0 = 1$.

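Next, we make the class $\Pi$ of feasible policies more explicit. For each $1 \le i < \infty$
and for each pair $(S_{i-1}, R_{i-1})$, a feasible strategy $\pi$ specifies a set $A_i(S_{i-1}, R_{i-1})$
such that

\[ A_i(S_{i-1}, 0) \subseteq [S_{i-1}, 1] \quad \text{and} \quad A_i(S_{i-1}, 1) \subseteq [0, S_{i-1}], \]

and $X_i$ is selected for our subsequence if and only if $X_i \in A_i(S_{i-1}, R_{i-1})$. For each
$\pi \in \Pi$, we have the basic relation

\[ A_N^o(\pi) = \sum_{i=1}^{N} 1(X_i \in A_i(S_{i-1}, R_{i-1})) = \sum_{i=1}^{\infty} 1(X_i \in A_i(S_{i-1}, R_{i-1}))\, 1(i \le N), \]

and, since $N$ is independent of the observations and $P(N \ge i) = \rho^{i-1}$, by taking
expectations on both sides we have

\[ E[A_N^o(\pi)] = E\Big[ \sum_{i=1}^{\infty} \rho^{i-1}\, 1(X_i \in A_i(S_{i-1}, R_{i-1})) \Big]. \]

We came to this relation by considering random sample sizes with the geometric
distribution, but the right side of this identity can also be interpreted as the
infinite-horizon discounted expected length of the alternating subsequence selected by $\pi$.
We are interested in the policy $\pi^* \in \Pi$ such that

\[ E[A_N^o(\pi^*)] = \sup_{\pi \in \Pi} E\Big[ \sum_{i=1}^{\infty} \rho^{i-1}\, 1(X_i \in A_i(S_{i-1}, R_{i-1})) \Big], \]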

and from the general theory of Markov decision problems, we know that an optimal 
policy can be characterized as the solution of an associated Bellman equation. 
First Bellman Equation. For any $i$ such that $S_{i-1} = s$ and $R_{i-1} = r$ we let
$v(s, r)$ denote the expected number of selections made after time $i$ by an
optimal policy. By the lack of memory property of the geometric distribution and by
the usual considerations of dynamic programming one can now check that $v(s, r)$
satisfies the Bellman equation:

\[ v(s, r) = \begin{cases}
\rho s\, v(s, 0) + \int_s^1 \max\{\rho v(s, 0),\, 1 + \rho v(x, 1)\}\, dx & \text{if } r = 0 \\
\rho (1 - s)\, v(s, 1) + \int_0^s \max\{\rho v(s, 1),\, 1 + \rho v(x, 0)\}\, dx & \text{if } r = 1.
\end{cases} \tag{1} \]



To see why this equation holds, first consider the case when $r = 0$ (so the next
selection is to be a local maximum). With probability $\rho$ we get to see another
observation $X_{i+1}$, and with probability $s$ the value we observe is less than the
previously selected value. In this case, we do not have the opportunity to make a
selection, and this observation contributes the term $\rho s\, v(s, 0)$ to our equation.

Next, consider the case when $s \le X_{i+1} \le 1$. Now one can choose to select $X_{i+1} = x$
or not. If we do not select $X_{i+1} = x$, the expected number of subsequent selections is
$\rho v(s, 0)$, and if we do select $X_{i+1} = x$ we increment the length of the sequence by 1 and the expected
number of subsequent selections that are made by an optimal policy in the future is
given by $\rho v(x, 1)$. Since on this event $X_{i+1}$ is uniformly distributed in $[s, 1]$, the expected optimal
contribution is given by the second term of our Bellman equation (top line). The
proof of the second line of the Bellman equation is completely analogous.

Finally, given a solution $v(s, r)$ to the Bellman equation (1), we have

\[ v(1, 1) = E[A_N^o(\pi^*)], \]

so now our goal is to determine $v(1, 1)$. To do this it will be useful to reorganize
the Bellman equation (1) in a tidier form. This is possible since the solution $v(s, r)$
of the Bellman equation has a useful symmetry property.

Lemma 3 (Reflection Identity). For all $s \in [0, 1]$ the solution $v(s, r)$ of the Bellman
equation (1) satisfies

\[ v(s, 0) = v(1 - s, 1). \tag{2} \]

Proof. The Bellman equation (1) is a fixed point equation, and by the classical
theory of dynamic programming it can be solved by iteration (cf. Bertsekas and
Shreve, 1978, Sec. 9.5). We will prove the identity (2) by showing that it holds for
the sequence of approximations, so it also holds for the limit.

We first set $v^0(s, r) = 0$ for all $(s, r) \in [0, 1] \times \{0, 1\}$ and we note that $v^0$ trivially
satisfies the Reflection Identity (2). Next, for our induction hypothesis, we assume
that we have $v^{n-1}(s, 0) = v^{n-1}(1 - s, 1)$ for all $s \in [0, 1]$. The next iterate in the
sequence is then given by

\[ v^n(s, 0) = \rho s\, v^{n-1}(s, 0) + \int_s^1 \max\{\rho v^{n-1}(s, 0),\, 1 + \rho v^{n-1}(x, 1)\}\, dx. \]

By applying our induction hypothesis on $v^{n-1}$ we then obtain

\[ v^n(s, 0) = \rho s\, v^{n-1}(1 - s, 1) + \int_s^1 \max\{\rho v^{n-1}(1 - s, 1),\, 1 + \rho v^{n-1}(1 - x, 0)\}\, dx. \]

Now, after changing variables in the integral on the right-hand side, we find

\[ v^n(s, 0) = \rho s\, v^{n-1}(1 - s, 1) + \int_0^{1-s} \max\{\rho v^{n-1}(1 - s, 1),\, 1 + \rho v^{n-1}(x, 0)\}\, dx = v^n(1 - s, 1), \]

and this completes the induction step. Now, for all $(s, r) \in [0, 1] \times \{0, 1\}$ we have
$v^n(s, r) \to v(s, r)$ as $n \to \infty$, so taking limits in the last identity completes the proof
of the reflection identity. □

A Simpler Equation. Using the reflection identity (2) we can put the Bellman
equation (1) into a more graceful form. Specifically, if we introduce a single variable
function $v(y)$ defined by setting

\[ v(y) = v(y, 0) = v(1 - y, 1), \]

then substitution into our original equation (1) gives us

\[ v(y) = \rho y\, v(y) + \int_y^1 \max\{\rho v(y),\, 1 + \rho v(1 - x)\}\, dx. \tag{3} \]

Here we should note that by the definition of $v(y) = v(y, 0)$ we have that $v(\cdot)$ is
continuous, $v(1) = 0$, and $v$ is non-increasing on $[0, 1]$. We will show shortly that $v$
is actually piecewise linear and it is constant on an initial segment of $[0, 1]$.
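
The fixed point of the single-variable Bellman equation can be approximated numerically by the same iteration that the proof of the reflection identity uses. The sketch below is an illustration under stated assumptions, not the authors' method: the function names are hypothetical, the interval $[0, 1]$ is replaced by a uniform grid, and the integral is approximated by the trapezoid rule. For comparison it evaluates the closed form $(3 - 2\sqrt{2} - \rho + \rho\sqrt{2})/(\rho(1 - \rho))$ associated with Theorem 2, which should be close to the computed $v(0) = v(1, 1)$ when $\rho$ is near 1.

```python
import math


def solve_bellman(rho, m=100, sweeps=200):
    """Iterate v <- T v for v(y) = rho*y*v(y) + int_y^1 max{rho*v(y), 1 + rho*v(1-x)} dx.

    Returns the grid values v(j/m), j = 0..m. Since T is a contraction with
    modulus rho, `sweeps` iterations leave an error of order rho**sweeps.
    """
    h = 1.0 / m
    v = [0.0] * (m + 1)
    for _ in range(sweeps):
        # select[k]: value of selecting an observation equal to x_k = k/m
        select = [1.0 + rho * v[m - k] for k in range(m + 1)]
        new_v = []
        for j in range(m + 1):
            skip = rho * v[j]  # value of rejecting the observation
            f = [max(skip, select[k]) for k in range(j, m + 1)]
            integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))  # trapezoid rule
            new_v.append(rho * (j / m) * v[j] + integral)
        v = new_v
    return v


def closed_form(rho):
    """Closed-form expected number of selections for geometric samples."""
    root2 = math.sqrt(2.0)
    return (3.0 - 2.0 * root2 - rho + rho * root2) / (rho * (1.0 - rho))


if __name__ == "__main__":
    rho = 0.9
    v = solve_bellman(rho)
    print("numerical v(0):", v[0])
    print("closed form   :", closed_form(rho))
```

The grid size and the number of sweeps are arbitrary choices; a finer grid reduces the quadrature error, which is amplified by roughly a factor $1/(1 - \rho)$ at the fixed point.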

An Alternative Interpretation. The symmetrized equation (3) can be used to
obtain a new probabilistic interpretation of $v(y)$. To set this up, we first put

\[ f^*(y) = \inf\{x \in [y, 1] : \rho v(y) \le 1 + \rho v(1 - x)\}. \tag{4} \]

With this definition, we can rewrite (3) a bit more nicely as

\[ v(y) = \rho f^*(y)\, v(y) + \int_{f^*(y)}^1 \{1 + \rho v(1 - x)\}\, dx. \tag{5} \]

Thus, one removes the maximum from the integrand of (3) at the price of introducing
a threshold function $f^*$ that depends on $v$.

We now recursively define random variables $\{Y_i : i = 1, 2, \ldots\}$ by setting $Y_0 = y$
and taking

\[ Y_i = \begin{cases} Y_{i-1} & \text{if } X_i < f^*(Y_{i-1}) \\ 1 - X_i & \text{if } X_i \ge f^*(Y_{i-1}), \end{cases} \]

and finally introduce a new value function

\[ v_0(y) = E\Big[ \sum_{i=1}^{\infty} \rho^{i-1}\, 1(X_i \ge f^*(Y_{i-1})) \,\Big|\, Y_0 = y \Big]. \tag{6} \]

The next proposition shows that $v_0(y)$ is actually equal to $v(y)$. As part of the
bargain, we obtain a concrete characterization of the threshold function $f^*$.

Proposition 4 (Structure of the Solution of the Bellman Equation). We have the
following characterizations of $f^*$ and $v_0$:
(i) There is a unique $\xi_0 \in [0, 1]$ such that

\[ f^*(y) = \max\{\xi_0, y\} \quad \text{for all } 0 \le y \le 1, \]

and moreover $0 \le \xi_0 < 1/2$.
(ii) The function $v_0(\cdot)$ is a solution of the Bellman equation (3), so, by uniqueness,
we have $v_0(y) = v(y)$ for all $0 \le y \le 1$.

Proof. From the definition of $f^*$ we see that

\[ \rho v(y) \le 1 + \rho v(1 - y) \iff f^*(y) = y. \tag{7} \]

Now, for $1/2 \le y$ we have $1 - y \le y$, so the monotonicity of $v$ gives us the bound
$\rho v(y) \le 1 + \rho v(1 - y)$; consequently, we have $f^*(y) = y$ for $y \in [1/2, 1]$.

If the condition (7) holds for all $y \in [0, 1/2)$, then $f^*(y) = y$ for all $y \in [0, 1]$ and
we can take $\xi_0 = 0$. Otherwise there is a $y_0 \in [0, 1/2)$ for which we have

\[ 1 + \rho v(1 - y_0) < \rho v(y_0). \]

For $\Delta(y) = 1 + \rho v(1 - y) - \rho v(y)$ we then have $\Delta(y_0) < 0$ and $\Delta(1) = 1 + \rho v(0) > 0$,
so by continuity we have $S = \{y : \Delta(y) = 0\} \ne \emptyset$. If we now take $\xi_0$ to be the
infimum of $S$, then $\xi_0 \in [y_0, 1/2) \subset [0, 1/2)$ and $\rho v(\xi_0) = 1 + \rho v(1 - \xi_0)$. The
definition of $f^*$ now tells us that $f^*(y) = \xi_0$ for $y \le \xi_0$ and $f^*(y) = y$ for $\xi_0 \le y$.
This completes the proof of the first part of the proposition.

Finally, to check that $v_0$ defined by (6) solves the Bellman equation, we just condition on the value of
$X_1$ and calculate the expectation of the sum. When we take the total expectation,
we get the right side of (5). □

Characterization of the Critical Value. Now that we know that the threshold
function $f^*$ for the solution of the Bellman equation (3) has the form
$f^*(y) = \max\{\xi_0, y\}$ for some $\xi_0 \in [0, 1/2)$, the main problem is to find $\xi_0$. The natural
plan is to fix $\xi \in [0, 1/2]$ and to consider a general selection function of the form
$f(y) = \max\{\xi, y\} = (\xi \vee y)$. We then want to calculate the associated value function
and to optimize over $\xi$.

The associated value function is given by

\[ V(y, \xi, \rho) = E\Big[ \sum_{i=1}^{\infty} \rho^{i-1}\, 1(X_i \ge \max\{\xi, Y_{i-1}\}) \,\Big|\, Y_0 = y \Big], \tag{8} \]

and Proposition 4 then tells us that

\[ \max_{\xi \in [0, 1/2]} V(y, \xi, \rho) = v(y) \quad \text{for all } y \in [0, 1]. \]

If we abbreviate $V(y, \xi, \rho)$ by setting $V(y) = V(y, \xi, \rho)$, then by conditioning on $X_1$
in equation (8) we see that $V(y)$ satisfies the integral equation

\[ V(y) = (\xi \vee y)\rho V(y) + \int_{\xi \vee y}^{1} \{1 + \rho V(1 - x)\}\, dx
        = (\xi \vee y)\rho V(y) + \int_0^{1 - (\xi \vee y)} \{1 + \rho V(x)\}\, dx. \tag{9} \]

This equation has several attractive features. In particular, if we set $y = 1$ then
from $0 < \rho < 1$ we see $V(1) = 0$. Also, by writing

\[ V(y) = \frac{1}{1 - \rho(\xi \vee y)} \int_0^{1 - (\xi \vee y)} \{1 + \rho V(x)\}\, dx \]

we see that the right side does not change when $y \in [0, \xi]$, so we have

\[ V(y) = V(y') \quad \text{for all } 0 \le y, y' \le \xi. \tag{10} \]
From now on, we will let $V'(\xi)$ denote the right derivative of the solution $V$ of the integral equation
(9) evaluated at $\xi$ and let $V'(1 - \xi)$ denote the left derivative of $V$ evaluated at
$1 - \xi$. Elsewhere $V'(y)$ simply denotes the derivative of $V$ evaluated at $y$.

Lemma 5. The solution of equation (9) satisfies the following four conditions:

(i) $V(1 - \xi)(1 - \rho + \rho\xi) = \xi + \rho\xi V(\xi)$;

(ii) $V'(\xi)(1 - \rho\xi) = \rho[V(\xi) - V(1 - \xi)] - 1$;

(iii) $V'(1 - \xi)(1 - \rho + \rho\xi) = \rho[V(1 - \xi) - V(\xi)] - 1$;

(iv) $V'(1 - \xi)(1 - \rho + \rho\xi)^2(1 - \rho\xi) = V'(\xi)(1 - \rho\xi)^2(1 - \rho + \rho\xi) + (1 - \rho + \rho\xi)^2 - (1 - \rho\xi)^2$.

Proof. Conditions (i)-(iii) are easy to check. Condition (i) is just (9) evaluated at
$1 - \xi$ together with (10). Conditions (ii) and (iii) simply follow by evaluating (9)
at $\xi$ and $1 - \xi$ respectively and by differentiating both sides with respect to $\xi$.

The proof of Condition (iv) requires more work. Consider $y \in (\xi, 1 - \xi)$ so that
the integral equation (9) becomes

\[ V(y) = y\rho V(y) + \int_0^{1-y} \{1 + \rho V(x)\}\, dx. \]

Differentiating once we have

\[ V'(y)(1 - \rho y) = \rho[V(y) - V(1 - y)] - 1, \tag{11} \]

and differentiating again gives us

\[ V''(y)(1 - \rho y) - \rho V'(y) = \rho V'(y) + \rho V'(1 - y). \tag{12} \]

To estimate the value of $V'(1 - y)$ we note that $1 - y \in (\xi, 1 - \xi)$, and we evaluate
the integral equation (9) at $1 - y$. We then differentiate with respect to $y$ to obtain

\[ V'(1 - y)(1 - \rho + \rho y) = \rho[V(1 - y) - V(y)] - 1. \tag{13} \]

By combining equations (11) and (13) we then have

\[ V'(1 - y) = (1 - \rho + \rho y)^{-1}\big( -V'(y)(1 - \rho y) - 2 \big), \]

which we can plug into the last addend of (12) to obtain

\[ V''(y)(1 - \rho y)(1 - \rho + \rho y) = V'(y)\rho(1 - 2\rho + 3\rho y) - 2\rho. \tag{14} \]

By multiplying both sides of (14) by $(1 - \rho y)$ we obtain the critical identity

\[ V''(y)(1 - \rho y)^2(1 - \rho + \rho y) = V'(y)\rho(1 - \rho y)(1 - 2\rho + 3\rho y) - 2\rho(1 - \rho y). \tag{15} \]

For $h(y) = (1 - \rho y)^2(1 - \rho + \rho y)$ notice that $h'(y) = -\rho(1 - \rho y)(1 - 2\rho + 3\rho y)$, so
that we can rewrite the identity (15) as

\[ V''(y)h(y) + V'(y)h'(y) - [(1 - \rho y)^2]' = 0. \]

An immediate integration then gives us

\[ V'(y)h(y) - (1 - \rho y)^2 = C, \]

where $C$ is a constant, and if we take $C = V'(\xi)h(\xi) - (1 - \rho\xi)^2$ we find

\[ V'(y) = V'(\xi)\frac{h(\xi)}{h(y)} + \frac{(1 - \rho y)^2 - (1 - \rho\xi)^2}{h(y)} \quad \text{for all } \xi \le y \le 1 - \xi. \tag{16} \]

Finally, on setting $y = 1 - \xi$ we recover the desired condition (iv). □
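Calculation of the Critical Value. Conditions (i)-(iv) in Lemma 5 generate a
system of four equations in four unknowns, $V(\xi)$, $V(1 - \xi)$, $V'(\xi)$, and $V'(1 - \xi)$.
By solving this system one finds

\[ V(\xi) = \frac{2 - 2\xi - \rho + 2\rho\xi - 2\rho\xi^2}{2(1 - \rho)(1 - \rho\xi)}, \tag{17} \]

\[ V(1 - \xi) = \frac{\xi\,(2 - 4\rho\xi - \rho^2 + 4\rho^2\xi - 2\rho^2\xi^2)}{2(1 - \rho)(1 - \rho\xi)(1 - \rho + \rho\xi)}, \]

\[ V'(\xi) = \frac{-2 + 4\rho - 4\rho\xi - \rho^2 + 2\rho^2\xi^2}{2(1 - \rho\xi)^2(1 - \rho + \rho\xi)}, \tag{18} \]

\[ V'(1 - \xi) = \frac{-2 + 4\rho\xi + \rho^2 - 4\rho^2\xi + 2\rho^2\xi^2}{2(1 - \rho\xi)(1 - \rho + \rho\xi)^2}. \]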
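
Since the solution of the four-equation system is easy to mistranscribe, a quick numerical check is useful. The Python sketch below is an added illustration with a hypothetical function name; it takes the four expressions for $V(\xi)$, $V(1 - \xi)$, $V'(\xi)$ and $V'(1 - \xi)$ as written here (numerators $2 - 2\xi - \rho + 2\rho\xi - 2\rho\xi^2$, $\xi(2 - 4\rho\xi - \rho^2 + 4\rho^2\xi - 2\rho^2\xi^2)$, and so on) and returns the residuals of conditions (i)-(iv) of Lemma 5, which should all vanish up to rounding.

```python
def lemma5_residuals(rho, xi):
    """Residuals (left side minus right side) of conditions (i)-(iv) of Lemma 5."""
    a = 1.0 - rho * xi          # the factor 1 - rho*xi
    b = 1.0 - rho + rho * xi    # the factor 1 - rho + rho*xi
    v_xi = (2 - 2 * xi - rho + 2 * rho * xi - 2 * rho * xi ** 2) / (2 * (1 - rho) * a)
    v_1mxi = (xi * (2 - 4 * rho * xi - rho ** 2 + 4 * rho ** 2 * xi
                    - 2 * rho ** 2 * xi ** 2) / (2 * (1 - rho) * a * b))
    dv_xi = (-2 + 4 * rho - 4 * rho * xi - rho ** 2
             + 2 * rho ** 2 * xi ** 2) / (2 * a * a * b)
    dv_1mxi = (-2 + 4 * rho * xi + rho ** 2 - 4 * rho ** 2 * xi
               + 2 * rho ** 2 * xi ** 2) / (2 * a * b * b)
    return (
        v_1mxi * b - (xi + rho * xi * v_xi),                      # (i)
        dv_xi * a - (rho * (v_xi - v_1mxi) - 1.0),                # (ii)
        dv_1mxi * b - (rho * (v_1mxi - v_xi) - 1.0),              # (iii)
        dv_1mxi * b * b * a - (dv_xi * a * a * b + b * b - a * a),  # (iv)
    )


if __name__ == "__main__":
    for rho in (0.3, 0.7, 0.95):
        for xi in (0.0, 0.1, 0.25, 0.5):
            print(rho, xi, max(abs(r) for r in lemma5_residuals(rho, xi)))
```

The check only confirms that the four expressions solve the linear system; it says nothing about whether the system itself was derived correctly.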

Finally, by substituting (18) into (16) we get

\[ V'(y) = \frac{2(1 - \rho y)^2 - (2 - \rho)^2}{2(1 - \rho + \rho y)(1 - \rho y)^2}. \]

Now, given any $\xi$, we want to compute $V(0, \xi, \rho)$. We first recall that we have
$V(1, \xi, \rho) = 0$ and $V(y, \xi, \rho) = V(\xi, \xi, \rho)$ for all $0 \le y \le \xi$. We therefore find that
$\frac{\partial}{\partial y} V(y, \xi, \rho) = 0$ on $0 \le y \le \xi$, so on integrating we have

\[ V(1, \xi, \rho) - V(0, \xi, \rho) = \int_0^1 V'(y)\, dy = \int_\xi^1 V'(y)\, dy \]

and hence

\[ V(0, \xi, \rho) = -\int_\xi^1 V'(y)\, dy. \]

We now optimize this last quantity with respect to $\xi$. By differentiating both sides
with respect to $\xi$ we get

\[ \frac{\partial}{\partial \xi} V(0, \xi, \rho) = V'(\xi) \]

and we are interested in the value $\xi_0$ such that

\[ V'(\xi_0) = 0. \]

Our formula (18) for $V'(\xi_0)$ tells us that $V'(\xi_0) = 0$ if and only if

\[ 2(1 - \rho\xi_0)^2 - (2 - \rho)^2 = 0. \]

We therefore find that the unique choice for $\xi_0$ is given by

\[ \xi_0 = \frac{2 - \sqrt{2}\,(2 - \rho)}{2\rho}. \tag{19} \]

A routine calculation verifies that $V''(\xi_0) < 0$, so we have found our maximum.
When we evaluate $V(\xi_0, \xi_0, \rho)$ using equation (17), we find

\[ V(\xi_0, \xi_0, \rho) = \frac{3 - 2\sqrt{2} - \rho + \rho\sqrt{2}}{\rho(1 - \rho)}, \]

and this gives us the main formula of Theorem 2. From this formula it is immediate
that

\[ \lim_{\rho \uparrow 1} (1 - \rho) V(\xi_0, \xi_0, \rho) = 2 - \sqrt{2}, \]

so the proof of Theorem 2 is complete.
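
The last steps of this calculation are purely algebraic, so they can be sanity-checked in a few lines. The Python sketch below is an added illustration with hypothetical function names. It assumes the root of $2(1 - \rho\xi_0)^2 = (2 - \rho)^2$ with $1 - \rho\xi_0 > 0$, namely $\xi_0 = (2 - \sqrt{2}(2 - \rho))/(2\rho)$, evaluates $V(\xi) = (2 - 2\xi - \rho + 2\rho\xi - 2\rho\xi^2)/(2(1 - \rho)(1 - \rho\xi))$ at that point, and compares the result with $(3 - 2\sqrt{2} - \rho + \rho\sqrt{2})/(\rho(1 - \rho))$.

```python
import math

ROOT2 = math.sqrt(2.0)


def critical_xi(rho):
    """Root of 2*(1 - rho*xi)**2 = (2 - rho)**2 with 1 - rho*xi > 0."""
    return (2.0 - ROOT2 * (2.0 - rho)) / (2.0 * rho)


def value_at_threshold(rho, xi):
    """V(xi) for the threshold policy max{xi, y}, valid for 0 < rho < 1."""
    return ((2 - 2 * xi - rho + 2 * rho * xi - 2 * rho * xi ** 2)
            / (2 * (1 - rho) * (1 - rho * xi)))


def closed_form(rho):
    """Candidate closed form for the optimal value."""
    return (3 - 2 * ROOT2 - rho + rho * ROOT2) / (rho * (1 - rho))


if __name__ == "__main__":
    # Note: this root is non-negative only when rho >= 2 - sqrt(2) ~ 0.586,
    # so the sample values of rho are taken above that level.
    for rho in (0.6, 0.75, 0.9, 0.99):
        xi0 = critical_xi(rho)
        print(rho, xi0, value_at_threshold(rho, xi0), closed_form(rho),
              (1 - rho) * closed_form(rho))
    print("limit 2 - sqrt(2) =", 2 - ROOT2)
```

As $\rho \to 1$ the printed values of $(1 - \rho)$ times the closed form approach $2 - \sqrt{2}$, and `critical_xi(1.0)` equals $1 - 2^{-1/2}$, the threshold that reappears in the lower bound argument for fixed sample sizes.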
3. Proof of Theorem 1 from Theorem 2

We will use our results for geometric sample sizes to get both lower and upper
bounds for the finite sample size selection problem. The lower bound is the easiest.
For fixed $n$, one can use the (now suboptimal) policy from an appropriately chosen
geometric sample size problem. The proof of the upper bound is considerably
harder, and the method will be described later in this section. Before making these
arguments, we need to organize a few structural observations.

Selection Policies and a Bellman Equation for Finite Samples. When the
sample size $n$ is deterministic and known, the feasible policies need to take this
information into account. In particular, the selection thresholds will no longer be
stationary; they will depend on the number of sample elements that remain to be
seen.

Just as in Section 2, we consider the pairs $(S_{i-1}, R_{i-1})$, $1 \le i \le n$, where $S_{i-1}$
is the size of the last selection made before time $i$ and $R_{i-1}$ is 0 or 1 accordingly
as the last selection was a local minimum or a local maximum. A feasible policy
$\pi \in \Pi$ again specifies a set $A_{i,n}(S_{i-1}, R_{i-1})$ that depends only on past actions, but
now we have dependence on the decision time $i$ and on the sample size $n$. For any
policy $\pi \in \Pi$ the expected size of the selected sample can then be written as

\[ E[A_n^o(\pi)] = E\Big[ \sum_{i=1}^{n} 1(X_i \in A_{i,n}(S_{i-1}, R_{i-1})) \Big]. \]

In this case, an optimal policy can be characterized as the solution to a finite sample
Bellman equation. Specifically, for $1 \le i \le n$, we have

\[ v_{i,n}(s, r) = \begin{cases}
s\, v_{i+1,n}(s, 0) + \int_s^1 \max\{v_{i+1,n}(s, 0),\, 1 + v_{i+1,n}(x, 1)\}\, dx & \text{if } r = 0 \\
(1 - s)\, v_{i+1,n}(s, 1) + \int_0^s \max\{v_{i+1,n}(s, 1),\, 1 + v_{i+1,n}(x, 0)\}\, dx & \text{if } r = 1,
\end{cases} \]

and the backward induction begins by setting $v_{n+1,n}(s, r) = 0$ for all $(s, r)$ in
$[0, 1] \times \{0, 1\}$. This equation is justified by the same considerations that were used
in the derivation of equation (1).

Symmetry and Simplification. For the finite sample size problem, one loses
much of the nice symmetry of the geometric sample size problem. Nevertheless,
the solution of the finite sample Bellman equation still has a reflection identity
analogous to that given by Lemma 3.

Lemma 6. The solution of the finite sample Bellman equation satisfies

\[ v_{i,n}(s, 0) = v_{i,n}(1 - s, 1) \quad \text{for all } 1 \le i \le n \text{ and all } s \in [0, 1]. \tag{20} \]

Proof. Again we use an induction argument, but this time we do not need to take
limits of an infinite sequence of approximate solutions. Instead we simply use
backward induction and always work with exact solutions.

Since we have $v_{n,n}(s, 0) = 1 - s$ and $v_{n,n}(1 - s, 1) = 1 - s$, we see that equation
(20) holds for $i = n$, so we suppose by induction that $v_{i+1,n}(s, 0) = v_{i+1,n}(1 - s, 1)$.
One then has

\[ v_{i,n}(s, 0) = s\, v_{i+1,n}(s, 0) + \int_s^1 \max\{v_{i+1,n}(s, 0),\, 1 + v_{i+1,n}(x, 1)\}\, dx, \]

so by applying the induction hypothesis on the right-hand side one obtains

\[ v_{i,n}(s, 0) = s\, v_{i+1,n}(1 - s, 1) + \int_s^1 \max\{v_{i+1,n}(1 - s, 1),\, 1 + v_{i+1,n}(1 - x, 0)\}\, dx. \]

If we now change variable in this last integral, we get

\[ v_{i,n}(s, 0) = s\, v_{i+1,n}(1 - s, 1) + \int_0^{1-s} \max\{v_{i+1,n}(1 - s, 1),\, 1 + v_{i+1,n}(x, 0)\}\, dx = v_{i,n}(1 - s, 1), \]

and this completes the induction step. □
We can now define a new single variable function $v_{i,n}(y)$ by setting

\[ v_{i,n}(y) = v_{i,n}(y, 0) = v_{i,n}(1 - y, 1) \tag{21} \]

and, by substitution into the original finite sample Bellman equation we have

\[ v_{i,n}(y) = y\, v_{i+1,n}(y) + \int_y^1 \max\{v_{i+1,n}(y),\, 1 + v_{i+1,n}(1 - x)\}\, dx. \tag{22} \]

Here we should also note that $v_{i,n}(\cdot)$ is continuous and non-increasing on $[0, 1]$ for
all $1 \le i \le n$.
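
The single-variable recursion lends itself to direct backward induction on a grid. The sketch below is an added illustration, not the authors' computation: the function name `finite_value` is hypothetical, $[0, 1]$ is discretized uniformly, and the integral is approximated by the trapezoid rule. It returns the grid values of $v_{1,n}$, so that its first entry approximates the optimal expected number of selections, which can then be compared with the bounds $(2 - \sqrt{2})n$ and $(2 - \sqrt{2})n + 11 - 4\sqrt{2}$ of Theorem 1.

```python
import math


def finite_value(n, m=120):
    """Grid values of v_{1,n}(j/m), j = 0..m, by backward induction.

    Uses v_{n+1,n} = 0 and the recursion
    v_i(y) = y*v_{i+1}(y) + int_y^1 max{v_{i+1}(y), 1 + v_{i+1}(1-x)} dx.
    """
    h = 1.0 / m
    nxt = [0.0] * (m + 1)                      # v_{n+1,n}
    for _ in range(n):                         # i = n, n-1, ..., 1
        # select[k]: value of selecting an observation equal to x_k = k/m
        select = [1.0 + nxt[m - k] for k in range(m + 1)]
        cur = []
        for j in range(m + 1):
            f = [max(nxt[j], select[k]) for k in range(j, m + 1)]
            integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))  # trapezoid rule
            cur.append((j / m) * nxt[j] + integral)
        nxt = cur
    return nxt


if __name__ == "__main__":
    rate = 2.0 - math.sqrt(2.0)
    for n in (1, 2, 5, 10, 25):
        best = finite_value(n)[0]
        print(n, round(rate * n, 4), round(best, 4),
              round(rate * n + 11 - 4 * math.sqrt(2.0), 4))
```

For $n = 1$ and $n = 2$ the integrands are constant or linear, so the trapezoid rule reproduces $v_{n,n}(y) = 1 - y$ and $v_{n-1,n}(y) = (3/2)(1 - y^2)$ up to rounding; for larger $n$ the output is only an approximation whose accuracy depends on the grid.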

The Threshold Functions. We now define the finite-sample equivalent of the
threshold function (4) by setting

\[ f^*_{i,n}(y) = \inf\{x \in [y, 1] : v_{i+1,n}(y) \le 1 + v_{i+1,n}(1 - x)\}. \tag{23} \]

If we then set $Y_0 = y$ and define $Y_i$ recursively by setting

\[ Y_i = \begin{cases} Y_{i-1} & \text{if } X_i < f^*_{i,n}(Y_{i-1}) \\ 1 - X_i & \text{if } X_i \ge f^*_{i,n}(Y_{i-1}), \end{cases} \tag{24} \]

then, in complete parallel to the geometric case, we see that the solution of the
finite sample Bellman equation can be written more probabilistically as

\[ v_{1,n}(y) = E\Big[ \sum_{i=1}^{n} 1(X_i \ge f^*_{i,n}(Y_{i-1})) \,\Big|\, Y_0 = y \Big]. \tag{25} \]

Finally, from equation (21) we have

\[ v_{1,n}(0) = v_{1,n}(0, 0) = v_{1,n}(1, 1) = E[A_n^o(\pi_n^*)], \]

and this gives us the last piece of structural information that we need.

Proof of the Lower Bound. To prove that

\[ (2 - \sqrt{2})\, n \le E[A_n^o(\pi_n^*)] \quad \text{for all } n \ge 1 \]

we only need to choose a good suboptimal policy. We now fix $\xi \in [0, 1/2]$ and we
consider the policy in which $X_i$ is selected if and only if $X_i \ge \max\{\xi, Y_{i-1}\}$. Here,
$Y_0 = y$ is in the interval $[0, 1 - \xi]$ and the $Y_i$'s are defined recursively by setting

\[ Y_i = \begin{cases} Y_{i-1} & \text{if } X_i < \max\{\xi, Y_{i-1}\} \\ 1 - X_i & \text{if } X_i \ge \max\{\xi, Y_{i-1}\}. \end{cases} \]

The sequence $\{Y_i : i = 0, 1, \ldots\}$ is a discrete-time Markov chain on the state space
$[0, 1 - \xi]$. For a measurable $A \subseteq [0, 1 - \xi]$ we let $|A|$ denote the Lebesgue measure
of $A$, and for a measurable set $B \subseteq [0, 1 - \xi]$ we write $1 - B$ as shorthand for the
set $\{u \in [0, 1] : 1 - u \in B\}$. Given these abbreviations, the transition kernel of the
process $\{Y_i : i = 0, 1, \ldots\}$ can be written as

\[ K(x, C) = 1(x \in C)(\xi \vee x) + \big| (1 - C) \cap [\xi \vee x, 1] \big|. \]

It is now easy to check that the process $\{Y_i\}$ has a unique stationary distribution $\gamma$,
and in fact $\gamma$ is just the uniform distribution on $[0, 1 - \xi]$ (i.e. $\gamma(C) = (1 - \xi)^{-1}|C|$
for all measurable $C \subseteq [0, 1 - \xi]$).

For any starting value $Y_0 = y \in [0, 1 - \xi]$ the suboptimality of the selection
functions $\max\{\xi, \cdot\}$ gives that

\[ E\Big[ \sum_{i=1}^{n} 1(X_i \ge \max\{\xi, Y_{i-1}\}) \,\Big|\, Y_0 = y \Big] \le v_{1,n}(y). \]

Since $v_{1,n}(y)$ is non-increasing in $y$, we see that for any starting distribution $\mu$
supported on $[0, 1 - \xi]$ one has

\[ E_\mu\Big[ \sum_{i=1}^{n} 1(X_i \ge \max\{\xi, Y_{i-1}\}) \Big] \le v_{1,n}(0) = E[A_n^o(\pi_n^*)]. \]

If one chooses the starting distribution $\mu$ to be the stationary distribution $\gamma$, then

\[ E_\gamma\Big[ \sum_{i=1}^{n} 1(X_i \ge \max\{\xi, Y_{i-1}\}) \Big] = n\, E_\gamma[1 - \max\{\xi, Y_0\}] \le E[A_n^o(\pi_n^*)], \tag{26} \]

and we can compute the first expression explicitly. So, we have

\[ E_\gamma[1 - \max\{\xi, Y_0\}] = \frac{1}{1 - \xi} \int_0^{1-\xi} (1 - \max\{\xi, y\})\, dy = \frac{1 - 2\xi^2}{2(1 - \xi)}. \]

We can maximize this by taking $\xi = 1 - 2^{-1/2}$ (as in (19) when $\rho = 1$), and we
then obtain

\[ E_\gamma[1 - \max\{\xi, Y_0\}] = 2 - \sqrt{2}. \]

Together with the inequality (26), this completes the proof of our lower bound.
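
The stationary threshold policy used in this proof is simple enough to simulate. The sketch below is an added illustration with hypothetical function names: it runs the chain $Y_i$ for the rule "select $X_i$ when $X_i \ge \max\{\xi, Y_{i-1}\}$, then set $Y_i = 1 - X_i$" on uniform observations, and it evaluates the stationary selection probability $(1 - 2\xi^2)/(2(1 - \xi))$ for comparison. The starting state $y_0 = 0$ is an arbitrary assumption; its effect on the long-run rate is negligible.

```python
import math
import random


def threshold_policy_count(xs, xi, y0=0.0):
    """Number of selections of the rule X_i >= max{xi, Y_{i-1}} on the data xs."""
    y = y0
    count = 0
    for x in xs:
        if x >= max(xi, y):
            count += 1
            y = 1.0 - x  # reflected value of the new selection
    return count


def stationary_rate(xi):
    """Selection probability per step when Y_0 is uniform on [0, 1 - xi]."""
    return (1.0 - 2.0 * xi * xi) / (2.0 * (1.0 - xi))


if __name__ == "__main__":
    xi_star = 1.0 - 1.0 / math.sqrt(2.0)
    rng = random.Random(2011)  # fixed seed so that runs are repeatable
    sample = [rng.random() for _ in range(100000)]
    print("formula   :", stationary_rate(xi_star), "target", 2 - math.sqrt(2.0))
    print("simulation:", threshold_policy_count(sample, xi_star) / len(sample))
    print("xi = 0    :", threshold_policy_count(sample, 0.0) / len(sample))
```

With $\xi = 0$ the rule reduces to the greedy strategy in reflected coordinates and the rate drops to $1/2$, while $\xi = 1 - 2^{-1/2}$ gives a rate close to $2 - \sqrt{2}$.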

Proof of the Upper Bound. The proof of the upper bound in Theorem 1
requires a more sustained argument. Unlike the problem for geometric samples, the
value function $v_{i,n}(\cdot)$ is no longer constant on an initial segment of $[0, 1]$.
Nevertheless, the next proposition tells us that the value function does have a useful
uniform boundedness on an initial segment. This is the first of several structural
observations that we will need to obtain our upper bound for $E[A_n^o(\pi_n^*)]$.

Proposition 7 (Value Function Initial Segment Bounds). For all $0 \le u \le 1/6$ and
$n \ge 2$, the functions $v_{i,n}(\cdot)$ defined by the Bellman recursion (22) satisfy

(i) $1 \le v_{i,n}(u) - v_{i,n}(5/6)$, for all $1 \le i \le n - 1$;
(ii) $v_{i,n}(u) - v_{i,n}(1/6) \le 1$, for all $1 \le i \le n$.

Moreover, for $n \ge 3$, the threshold functions $f^*_{i,n}(y)$ defined by equation (23) are
guaranteed to satisfy $1/6 \le f^*_{i,n}(y)$ for $y \in [0, 1]$ and $1 \le i \le n - 2$.

Naturally enough, the proof of this proposition depends on inductive arguments
that exploit the defining Bellman equation. The first of these arguments gives us
some control over the changes of $v_{i,n}(u)$ when we change both $i$ and $u$.
Lemma 8 (Restricted Supermodularity). For $y \in [0, 1/2]$ and $u \in [y, 1 - y]$ the
functions $\{v_{i,n}(\cdot)\}$ defined by the Bellman recursion (22) satisfy

\[ v_{i+1,n}(u) - v_{i+1,n}(1 - y) \le v_{i,n}(u) - v_{i,n}(1 - y) \quad \text{for all } 1 \le i \le n. \]

Proof. We use backward induction on $i$, and, since $n$ is fixed, we abbreviate $v_{i,n}(\cdot)$
by $v_i(\cdot)$. For $i = n$ we have $v_{n+1}(u) = 0$ for all $u \in [0, 1]$. Moreover, $v_n(u) = 1 - u$
and $v_n(1 - y) = y$, so we have

\[ v_{n+1}(u) - v_{n+1}(1 - y) \le v_n(u) - v_n(1 - y) \quad \text{for all } u \in [y, 1 - y]. \]

Now, for our backward induction, we can assume more generally that

\[ v_{i+1}(u) - v_{i+1}(1 - y) \le v_i(u) - v_i(1 - y) \quad \text{for all } u \in [y, 1 - y]. \]

The Bellman equation (22) then gives us

\[ v_{i-1}(u) - v_{i-1}(1 - y) = u\, v_i(u) + \int_u^1 \max\{v_i(u),\, 1 + v_i(1 - x)\}\, dx
   - (1 - y)\, v_i(1 - y) - \int_{1-y}^1 \max\{v_i(1 - y),\, 1 + v_i(1 - x)\}\, dx, \]

and, since $u \le 1 - y$, we can break up the first integral to obtain

\[ v_{i-1}(u) - v_{i-1}(1 - y) = u\, v_i(u) - (1 - y)\, v_i(1 - y) + \int_u^{1-y} \max\{v_i(u),\, 1 + v_i(1 - x)\}\, dx
   + \int_{1-y}^1 \big[ \max\{v_i(u),\, 1 + v_i(1 - x)\} - \max\{v_i(1 - y),\, 1 + v_i(1 - x)\} \big]\, dx. \tag{27} \]

For $x \in [1 - y, 1]$, we have $v_i(y) \le v_i(1 - x)$ since $v_i(\cdot)$ is non-increasing on $[0, 1]$.
Therefore, since $y \le u \le 1 - y$ we have $v_i(1 - y) \le v_i(u) \le v_i(y)$, so that for
$x \in [1 - y, 1]$ we have

\[ \max\{v_i(u),\, 1 + v_i(1 - x)\} = \max\{v_i(1 - y),\, 1 + v_i(1 - x)\} = 1 + v_i(1 - x), \]

and we see that the last integral in (27) equals 0. We now have just the identity

\[ v_{i-1}(u) - v_{i-1}(1 - y) = u\, v_i(u) - (1 - y)\, v_i(1 - y) + \int_u^{1-y} \max\{v_i(u),\, 1 + v_i(1 - x)\}\, dx \]

or, equivalently,

\[ v_{i-1}(u) - v_{i-1}(1 - y) = u\,(v_i(u) - v_i(1 - y))
   + \int_u^{1-y} \max\{v_i(u) - v_i(1 - y),\, 1 + v_i(1 - x) - v_i(1 - y)\}\, dx. \]

Changing variables in this last integral then gives us the convenient identity

\[ v_{i-1}(u) - v_{i-1}(1 - y) = u\,(v_i(u) - v_i(1 - y))
   + \int_y^{1-u} \max\{v_i(u) - v_i(1 - y),\, 1 + v_i(x) - v_i(1 - y)\}\, dx. \tag{28} \]

Since $y \le u$ and $1 - u \le 1 - y$, we can now use our induction assumption to obtain

\[ v_{i-1}(u) - v_{i-1}(1 - y) \ge u\,(v_{i+1}(u) - v_{i+1}(1 - y))
   + \int_y^{1-u} \max\{v_{i+1}(u) - v_{i+1}(1 - y),\, 1 + v_{i+1}(x) - v_{i+1}(1 - y)\}\, dx
   = v_i(u) - v_i(1 - y), \]

where the last equality follows from the recursion (28). □

We can now complete the proof of the Value Function Bounds in Proposition [T] 

Proof of Proposition^ We begin by proving Q by backwards induction on i. As 
before, since n > 2 is fixed, we abbreviate Wi,n(-) by Vi{-). For i = n — 1 one 
iteration of the recursive definition of the Bellman equation (|22|) gives us that 
Vn-iix) = (3/2)(l-a;2), so - ?;„_i(5/6) = (3/2)(25/36 - m^) > i since by 

hypothesis we have u < 1/6. We now make the induction assumption 

1 < v.,+i{u) - Vj+i(5/6) for < u < 1/6, 

and observe from the Bellman equation (|22p that 

Vi{u) - Vi{5/6) ^ uvi+i{u) + / ma,x{vi+i{u),l + Vi+i{l ~ x)} dx 



- 5/6ui+i(5/6) - / max{vi+i{5/6),l + Vi+i{l- x)}dx. 
J5/6 

Since u < 5/6, the monotonicity of Wi(-) implies Ui+i(5/6) < So, for 

X S [5/6, 1], we have max{wi-|_i(5/6), l+Vi+i(l— x)} < max{wi4.i(u), l+w^+i (1— x)}. 
This gives us the lower bound 

u (w,+i(m)-Wj+i(5/6))+ / max{ui+i(w)-Ui+i(5/6), l+Wj+i(l - x)-Vi+i{5/6)} dx 

J u 

< Vi{u) - Vi{5/6). 

To get a lower bound for the integral of the maximum, we replace the integrand 
by Vi+i{u) — Vi+i{5 /6) on [w, 1/6) and replace it by l + Wi+i(l — x) — Vi+i{5/6) on 
[1/6, 5/6]. Changing variables then gives us 



1 



6 



5/6 



(29) -{v,+i{u)-v,+ii5/6))+ / {l + v,+,ix)~v,+i{5/6)}dx<v,iu)-v,{5/6). 



1/6 



By our induction hypothesis the first addend satisfies the bound 

(30) i < i(«,+i («)-«,+! (5/6)), 
and by Lemma [8l the second integral satisfies the bound 

f5/6 1-5/6 

{1 + Vn{x) - Vn{5/6)} dx < / {1 + Vi+i{x) - Vi+i{5 /&)} dx. 

'1/6 ^1/6 

If we now recall that Vn (x) — 1 — x and compute the integral on the left-hand side 
we then obtain 

32 /"^^^ 

(31) — < / {l + v,+i{x)-v,+i{5/6)}dx. 
36 Ji/e 

Finally, adding (pO)) and (|3T|) and recalling (|29|) gives us our target bound 

38 

1 < TTT <v^{u)-v,{5/6). 
3d 

To prove condition ^ we again use backwards induction. For i = n we have 
Vniu) = 1 — u, SO Vn{u) — w„(l/6) = 1/6 — u < 1. Supposc now that 

Vt+i{u) - ?;,+i(l/6) < 1 for < u < 1/6. 
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The Bellman recursion ((22|) then gives us 

/.1/6 

Vi{u) - Vi{l/6) < / max{ui+i(u)-i;j+i(l/6), l+u,+i(l - a;)-Ui+i(l/6)}dx 
Jo 

1-5/6 

+ / max{ui+i(u), 1 + Ui+i(a;)} - max{wi+i(l/6), 1 + da; 

Jl/6 

+ / max{vi+i{u), 1 + - a;)} - niax{wi+i(l/6), 1 + Vi+i{l - x)} dx. 

J5/6 

For X € [0, 1/6], we can check that first integrand is bounded by 1. To see this, 
we first note that left maximand is bounded by 1 by the induction assumption. 
Next, we note that Vi^i{l — x) < Wi+i(5/6) so, for the second maximand one has 
the bound 1 + — x) — Wi+i(l/6) < 1 + Vi+i{5/6) — Ui+i(l/6) and this last 

term is non-positive by the inequality 

For X e [1/6, 5/6] the second integrand is bounded by 

max{?;i+i(u) - Ui+i(l/6), 1 + - Ui+i(l/6)} < 1, 

since both maximands are bounded by 1; the first one because of the induction 
assumption and the second one because it is non-increasing in x and attains its 
maximum for x — 1/6. 

Finally, for x Cz [5/6, 1] the third integrand is bounded by 

max{ui+i(w) - 1 - Ui+i(l - a;),0} < 

since —Vi^i{l — x) < — Wi+i(l/6), and by the induction assumption, we see that the 
left maximand Vi^i{u) — 1 — 11^+1(1/6) is also non-positive. So, at last we have 

v,{u)-v^{l/6) <5/6< 1, 

and this completes the proof of condition 

The last claim of Proposition 7 is that $1/6 \le f^*_{i,n}(y)$ for all $y \in [0,1]$ and all
$1 \le i \le n-2$, $n \ge 3$. If $y \in [1/6, 1]$ this bound is trivial since $y \le f^*_{i,n}(y)$ for
$1 \le i \le n$. If $y \in [0, 1/6)$, then the inequality (i) gives us that $1 \le v_{i+1,n}(y) - v_{i+1,n}(5/6)$
for all $1 \le i \le n-2$, so that the definition of $f^*_{i,n}(y)$ in (25) gives the required lower
bound. This completes the proof of Proposition 7. □
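
Conditions (i) and (ii) can also be probed numerically. The following Python sketch is only an illustration and plays no role in the proof. It assumes that the Bellman recursion (22) has the form $v_i(u) = u\,v_{i+1}(u) + \int_u^1 \max\{v_{i+1}(u),\, 1 + v_{i+1}(1-x)\}\,dx$ with $v_n(u) = 1-u$, which is the form suggested by the expansion used in the proof of condition (ii). The names `solve_bellman` and `check_conditions`, the grid size `m`, and the use of the trapezoid rule are our own choices.

```python
import numpy as np

def solve_bellman(n, m=601):
    """Backward induction for v_i(u), i = 1..n, on a uniform grid of m points.

    Assumed recursion: v_i(u) = u v_{i+1}(u)
        + int_u^1 max{v_{i+1}(u), 1 + v_{i+1}(1 - x)} dx,  v_n(u) = 1 - u.
    """
    u = np.linspace(0.0, 1.0, m)
    h = u[1] - u[0]
    v = [None] * (n + 1)              # v[i] stores v_i on the grid
    v[n] = 1.0 - u                    # terminal condition
    for i in range(n - 1, 0, -1):
        nxt = v[i + 1]
        refl = nxt[::-1]              # refl[k] = v_{i+1}(1 - u[k])
        cur = np.empty(m)
        for j in range(m):
            integrand = np.maximum(nxt[j], 1.0 + refl[j:])
            # trapezoid rule on [u_j, 1]; equals 0 when only one node is left
            integral = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
            cur[j] = u[j] * nxt[j] + integral
        v[i] = cur
    return u, v

def check_conditions(n=20, m=601):
    """Return (smallest gap in (i), largest gap in (ii), v_1(0)).

    m - 1 should be divisible by 6 so that 1/6 and 5/6 are grid points.
    """
    u, v = solve_bellman(n, m)
    k16 = (m - 1) // 6                # grid index of u = 1/6
    k56 = 5 * (m - 1) // 6            # grid index of u = 5/6
    # (i): v_i(u) - v_i(5/6) for 0 <= u <= 1/6, checked for i <= n - 2
    gap_i = min((v[i][:k16 + 1] - v[i][k56]).min() for i in range(1, n - 1))
    # (ii): v_i(u) - v_i(1/6) for 0 <= u <= 1/6, checked for i <= n
    gap_ii = max((v[i][:k16 + 1] - v[i][k16]).max() for i in range(1, n + 1))
    return gap_i, gap_ii, v[1][0]
```

For a moderate horizon such as `check_conditions(20)`, one would expect the first returned value to be at least 1 and the second to be at most 1, in line with conditions (i) and (ii), up to discretization error. The third value approximates $v_{1,n}(0)$ and can be compared with $(2-\sqrt{2})n$.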

Proof of the Upper Bound: The Last Step. We now have all the tools
that we need to prove that there is a constant $C < 11 - 4\sqrt{2} \approx 5.343$ such that

\[
E[A_n^o(\pi_n^*)] \le (2 - \sqrt{2})\,n + C \quad \text{for all } n \ge 1.
\]

We first note that the bound is trivial for $n = 1$ and $n = 2$. For $n \ge 3$ we
let $\{f^*_{1,n}, \ldots, f^*_{n,n}\}$ denote the optimal threshold functions determined by recursive
solution of the Bellman equation (22) for the finite horizon problem with sample
size $n$. We will use the first $n - 2$ of these functions to construct a suboptimal
selection policy for the geometric sample size problem. From the suboptimality of
this policy we will obtain an inequality that will lead to our upper bound.






Construction of a Suboptimal Policy for the Infinite Horizon Problem.

We now consider the infinite horizon problem, and, as before, we let $\{X_1, X_2, \ldots\}$
denote the sequence of observations. Here is our selection process:

• We let $T_0$ denote the index of the first observation in the sequence that falls
in the interval $[5/6, 1]$. We select that observation as the first element of our
subsequence and we set $Y_{T_0} = 1 - X_{T_0}$. We note that $Y_{T_0}$ has the uniform
distribution on $[0, 1/6]$.

• Next we use the functions $\{f^*_{1,n}, \ldots, f^*_{n-2,n}\}$ to decide which of the next
$n - 2$ observations are to be selected. Specifically, we select the $i$'th observation
in the series if $X_{T_0+i} \ge f^*_{i,n}(Y_{T_0+i-1})$ where as usual the $Y_{T_0+i}$ are defined
by the recursion

\[
Y_{T_0+i} =
\begin{cases}
Y_{T_0+i-1} & \text{if } X_{T_0+i} < f^*_{i,n}(Y_{T_0+i-1}) \\
1 - X_{T_0+i} & \text{if } X_{T_0+i} \ge f^*_{i,n}(Y_{T_0+i-1}).
\end{cases}
\]

Here one should recall that by Proposition 7 we have $1/6 \le f^*_{i,n}(Y_{T_0+i-1})$
for $1 \le i \le n-2$, so we have $0 \le Y_{T_0+i} \le 5/6$ for $1 \le i \le n-2$.

• We will now show how our selection process can be repeated in a stationary
way. For $k = 0, 1, 2, \ldots$ we proceed as follows:

1. If $Y_{T_k+n-2} \in (1/6, 5/6]$, then we let

\[
\tau_{k+1} = \inf\{i \ge 1 : X_{T_k+n-2+i} \ge 5/6\},
\]

and we select the observation $X_{T_k+n-2+\tau_{k+1}}$. We note that the random
variable $Y_{T_k+n-2+\tau_{k+1}} = 1 - X_{T_k+n-2+\tau_{k+1}}$ is uniformly distributed on
$[0, 1/6]$.

2. If $Y_{T_k+n-2} \le 1/6$, then we simply let $\tau_{k+1} = 0$, and we again note that
$Y_{T_k+n-2+\tau_{k+1}}$ is uniformly distributed on $[0, 1/6]$.

3. We set $T_{k+1} = T_k + n - 2 + \tau_{k+1}$ and set $k = k + 1$.

4. Just as in the second bullet, we use the functions $\{f^*_{1,n}, \ldots, f^*_{n-2,n}\}$ to
decide which observations to select from $\{X_{T_k+1}, \ldots, X_{T_k+n-2}\}$.
At time $T_k + n - 2$ we are left with some value $Y_{T_k+n-2}$, and we return
to Step 1 of this bullet.

Analysis of the Policy. The suboptimal policy we constructed provides us with
an increasing sequence of stopping times $0 < T_0 < T_1 < T_2 < \cdots$ such that the
times $\{T_k : k \ge 1\}$ are regeneration times for the process $\{Y_i : i \ge T_0\}$. Moreover,
we also have an i.i.d. sequence of stopping times $\{\tau_k : k \ge 1\}$ given by

\[
\tau_k =
\begin{cases}
0 & \text{if } Y_{T_{k-1}+n-2} \le 1/6 \\
\inf\{i \ge 1 : X_{T_{k-1}+n-2+i} \ge 5/6\} & \text{if } Y_{T_{k-1}+n-2} > 1/6.
\end{cases}
\]

These regeneration times $\{T_k : k \ge 1\}$ can be written as a function of the stopping
times $\{\tau_k : k \ge 1\}$; specifically, we have

\[
T_k = T_0 + (n-2)k + \sum_{j=1}^{k} \tau_j. \tag{32}
\]







For any pair $(T_k, Y_{T_k})$, $0 \le k < \infty$, the number $r(T_k, Y_{T_k})$ of selections made
from $\{X_{T_k+1}, \ldots, X_{T_k+n-2}\}$ is then given by the sum

\[
r(T_k, Y_{T_k}) = \sum_{i=1}^{n-2} \mathbf{1}\big(X_{T_k+i} \ge f^*_{i,n}(Y_{T_k+i-1})\big).
\]

For each $0 < \rho < 1$, the selection process described gives us a feasible policy, so its
expected number of selections is a lower bound for the expected length $E[A_N^o(\pi^*)]$ of
the alternating subsequence selected by an optimal policy from a sample of geometric size.

Moreover, if for discounting purposes we view the number of selections $r(T_k, Y_{T_k})$
as being counted all at time $T_k + n - 2$, then we obtain a lower bound for the expected
value achieved by our suboptimal policy. We therefore have the bound

\[
E\Big[\sum_{k=0}^{\infty} \rho^{\,T_k+n-2}\, r(T_k, Y_{T_k})\Big] \le E[A_N^o(\pi^*)]. \tag{33}
\]

We now note that $T_0$ and $Y_{T_0}$ are independent, and we also note that for each
$k \ge 1$, the post-$T_k$ process $\{Y_{T_k+i} : i \ge 0\}$ is independent of $T_k$. Consequently, we
have the factorization

\[
E\big[\rho^{\,T_k+n-2}\, r(T_k, Y_{T_k})\big] = E\big[\rho^{\,T_k+n-2}\big]\, E\big[r(T_k, Y_{T_k})\big] \quad \text{for all } k \ge 0, \tag{34}
\]

and since $T_k$ is a regeneration epoch we also have

\[
E[r(T_k, Y_{T_k})] = E[r(T_0, Y_{T_0})] \quad \text{for all } k \ge 0.
\]

For $Y_{T_0} = y \in [0, 1/6]$ we recall the identity (25) and we observe that

\[
v_{1,n}(y) - 2 \le E[r(T_0, Y_{T_0}) \mid Y_{T_0} = y]
\]

since the policy of the right-hand side agrees with the policy of the left-hand side
for the first $n - 2$ observations, and the policy of the right-hand side never selects
the last two.

The monotonicity of $v_{1,n}(\cdot)$ and the inequality (ii) of Proposition 7 then give us
the lower bound

\[
E[A_n^o(\pi_n^*)] - 3 = v_{1,n}(0) - 3 \le E[r(T_0, Y_{T_0}) \mid Y_{T_0} = y] \quad \text{for all } 0 \le y \le 1/6,
\]

so by recalling that $0 \le Y_{T_0} \le 1/6$ and taking total expectations we see that

\[
E[A_n^o(\pi_n^*)] - 3 \le E[r(T_0, Y_{T_0})].
\]

The factorization (34) then gives us the bound

\[
E\big[\rho^{\,T_k+n-2}\big]\,\big(E[A_n^o(\pi_n^*)] - 3\big) \le E\big[\rho^{\,T_k+n-2}\, r(T_k, Y_{T_k})\big] \quad \text{for all } k \ge 0.
\]

If we now sum over $k$, use the representation (32) and use the suboptimality
condition (33), then we have

\[
\big(E[A_n^o(\pi_n^*)] - 3\big)\, E\Big[\sum_{k=0}^{\infty} \rho^{\,T_0 + (n-2)(k+1) + \sum_{j=1}^{k} \tau_j}\Big] \le E[A_N^o(\pi^*)]. \tag{35}
\]



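We now note that $T_0$ is also independent from the random variables $\{\tau_k : k \ge 1\}$,
and we recall that the $\tau_k$'s are i.i.d., so

\[
E\Big[\sum_{k=0}^{\infty} \rho^{\,T_0 + (n-2)(k+1) + \sum_{j=1}^{k} \tau_j}\Big]
= E[\rho^{T_0}] \sum_{k=0}^{\infty} \rho^{(n-2)(k+1)} \big(E[\rho^{\tau_1}]\big)^{k}.
\]

Since $x \mapsto \rho^x$ is convex, Jensen's inequality tells us that $\rho^{E[T_0]} \le E[\rho^{T_0}]$ and that
$\rho^{E[\tau_1]} \le E[\rho^{\tau_1}]$, so we have

\[
\sum_{k=0}^{\infty} \rho^{\,E[T_0] + (n-2)(k+1) + k E[\tau_1]}
\le E\Big[\sum_{k=0}^{\infty} \rho^{\,T_0 + (n-2)(k+1) + \sum_{j=1}^{k} \tau_j}\Big].
\]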

The left-hand side is an easy geometric series, and by substitution in equation (35)
we obtain the crucial bound

\[
E[A_n^o(\pi_n^*)] \le 3 + \frac{1 - \rho^{\,n-2+E[\tau_1]}}{\rho^{\,E[T_0]+n-2}}\; E[A_N^o(\pi^*)].
\]
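
The two ingredients of this step, Jensen's inequality and the geometric series, can be checked numerically. The Python sketch below is a hedged illustration rather than part of the argument. It uses the fact that $T_0$, the index of the first observation in $[5/6, 1]$, is geometric on $\{1, 2, \ldots\}$ with success probability $1/6$, so that $E[\rho^{T_0}] = \rho/(6 - 5\rho)$ and $E[T_0] = 6$. The function names and the sample values of `mean_tau` are hypothetical; the text only tells us that $E[\tau_1] < 6$.

```python
def pgf_geometric(rho, p=1.0 / 6.0):
    """E[rho**T] for T geometric on {1, 2, ...} with success probability p."""
    return p * rho / (1.0 - (1.0 - p) * rho)

def jensen_gap(rho, p=1.0 / 6.0):
    """E[rho**T] - rho**E[T]; non-negative because x -> rho**x is convex."""
    return pgf_geometric(rho, p) - rho ** (1.0 / p)

def series_closed_form(rho, n, mean_t0, mean_tau):
    """Closed form of sum_{k>=0} rho**(mean_t0 + (n-2)(k+1) + k*mean_tau)."""
    return rho ** (mean_t0 + n - 2) / (1.0 - rho ** (n - 2 + mean_tau))

def series_partial_sum(rho, n, mean_t0, mean_tau, terms=2000):
    """Direct partial sum of the same series, for comparison."""
    return sum(
        rho ** (mean_t0 + (n - 2) * (k + 1) + k * mean_tau)
        for k in range(terms)
    )
```

For instance, comparing `series_closed_form(0.9, 10, 6.0, 5.0)` with `series_partial_sum(0.9, 10, 6.0, 5.0)` illustrates the geometric series identity, and `jensen_gap(rho)` should be non-negative for every $0 < \rho < 1$.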

From the explicit formula for $E[A_N^o(\pi^*)]$ in Theorem 2 we then have

\[
E[A_n^o(\pi_n^*)] \le 3 + \frac{\big(1 - \rho^{\,n-2+E[\tau_1]}\big)\,(\cdots)}{\rho^{\,E[T_0]+n-1}\,(1-\rho)}.
\]

The bound above holds for all $0 < \rho < 1$, so by letting $\rho \to 1$ we obtain

\[
E[A_n^o(\pi_n^*)] \le 3 + (2 - \sqrt{2})(n - 2 + E[\tau_1]) \le (2 - \sqrt{2})\,n + 11 - 4\sqrt{2}
\]

since $E[\tau_1] < 6$. This completes the proof of the upper bound.
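
The arithmetic behind the last display is short, and the following Python sketch spells it out. It is an illustration under our own naming (`geometric_ratio`, `limiting_bound`, `additive_constant` are hypothetical helpers). It uses only two facts from the argument above: $(1 - \rho^m)/(1 - \rho) \to m$ as $\rho \to 1$, and $E[\tau_1] < 6$.

```python
import math

SLOPE = 2.0 - math.sqrt(2.0)   # the asymptotic rate 2 - sqrt(2)

def geometric_ratio(rho, m):
    """(1 - rho**m) / (1 - rho), which tends to m as rho -> 1."""
    return (1.0 - rho ** m) / (1.0 - rho)

def limiting_bound(n, mean_tau):
    """Right-hand side 3 + (2 - sqrt 2)(n - 2 + E[tau_1]) of the limit bound."""
    return 3.0 + SLOPE * (n - 2 + mean_tau)

def additive_constant(mean_tau=6.0):
    """Constant term of limiting_bound(n, mean_tau) after removing SLOPE * n.

    With the worst case mean_tau = 6 this equals 11 - 4 * sqrt(2).
    """
    return 3.0 + SLOPE * (mean_tau - 2.0)
```

Evaluating `additive_constant()` gives $3 + 4(2 - \sqrt{2}) = 11 - 4\sqrt{2}$, the constant that appears in the statement of the upper bound, and any value of `mean_tau` below 6 gives a strictly smaller constant.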

4. Observations on Methods and Connections 

Our principal goal has been to provide a reasonably definitive solution of a con- 
crete problem of sequential optimization. Still, the natural expectation is that the 
solution of such a problem should also offer some novel methodological perspective. 
Here we began by exploiting the well-known idea of passing to the infinite horizon 
problem, but less commonly (and somewhat doggedly) we made the trek back from 
the infinite horizon problem to the finite horizon problem. In retrospect, that trek 
had elements of inevitability to it, but it also had surprises. 

In a natural and easy way the policy for the infinite horizon problem gave us a 
lower bound for the finite horizon problem, but our first surprise was the discovery 
(at first numerically) that the lower bound was so close to optimal. There was also 
something natural about the upper bound for the finite horizon problem, though at 
first we argued it by contradiction. The idea was that if we had a policy for the finite
horizon problem that was "too good" then one should be able to concatenate that policy to
give a policy for the infinite horizon problem that would do better than our known 
optimal policy. The resulting contradiction would then provide an upper bound. 

This three-step process would seem to be applicable to many problems of optimal 
selection, though, from the details of our proof, it is clear that special features must 
be exploited. For example, without obtaining four relations in Lemma EJ we would 
not have been able to solve the infinite horizon problem. Three of these relations 
were straightforward, but the critical fourth relation still seems "lucky." We are 
also fortunate that symmetry relations simplified our Bellman equations. These 
simplifications have an intuitive basis from the alternating nature of the problem, 
but it seems fortuitous that these relations could be made rigorous by inductions 
(of several kinds) on the Bellman equation. 

There are many problems where one would like to go from the infinite horizon
problem to the finite horizon problem, but one especially attractive example is that of the
optimal on-line selection of a monotone subsequence from a sample of independent
observations. Here one knows the asymptotic behavior of the means for both finite
samples (Samuels and Steele, 1981) and random samples, including geometric
sized samples (Gnedin 1999; 2000). Most notably, in the infinite horizon case
one has a precise understanding of the variance and even a central limit theorem
(Bruss and Delbaen 2001; 2004). It would be quite interesting to know if an
analogous CLT can be obtained under the finite horizon formulation.